Enforcing Relational Matching Dependencies with Datalog for Entity Resolution

Authors

  • Zeinab Bahmani
  • Leopoldo E. Bertossi
Abstract

Entity resolution (ER) is about identifying and merging records in a database that represent the same real-world entity. Matching dependencies (MDs) have been introduced and investigated as declarative rules that specify ER policies. In general, an ER process induced by MDs over a dirty instance leads to multiple clean instances. General answer set programs (ASPs) have been proposed to specify the MD-based cleaning task and its results. In this work, we extend MDs to relational MDs, which capture more application semantics, and identify classes of relational MDs for which the general ASP can be automatically rewritten into a stratified Datalog program, with the single clean instance as its standard model.
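To make the idea of enforcing an MD concrete, here is a minimal Python sketch. It is not the authors' Datalog rewriting, and every specific in it is an assumption made for illustration: a hypothetical relation with `name` and `address` attributes, a single MD stating that records with similar names must have their addresses matched, a string similarity based on `difflib.SequenceMatcher` with a threshold of 0.8, and a matching function that keeps the longer of two values. The loop repeats until no record changes, which loosely mirrors a chase-like enforcement procedure.

```python
from difflib import SequenceMatcher


def similar(a, b, threshold=0.8):
    """Assumed similarity relation: case-insensitive ratio above a threshold."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio() >= threshold


def match(a, b):
    """Assumed matching function: keep the longer value (ties: lexicographic)."""
    return max(a, b, key=lambda v: (len(v), v))


def enforce_md(records):
    """Enforce the hypothetical MD  name ~ name  ->  address := match(address, address).

    Works on a copy of the input and repeats until a fixpoint is reached.
    """
    recs = [dict(r) for r in records]
    changed = True
    while changed:
        changed = False
        for i in range(len(recs)):
            for j in range(i + 1, len(recs)):
                r, s = recs[i], recs[j]
                if similar(r["name"], s["name"]) and r["address"] != s["address"]:
                    merged = match(r["address"], s["address"])
                    r["address"] = s["address"] = merged
                    changed = True
    return recs


records = [
    {"name": "John Smith", "address": "10 Main St"},
    {"name": "Jon Smith", "address": "10 Main Street, Ottawa"},
    {"name": "Alice Wong", "address": "5 Elm Rd"},
]
clean = enforce_md(records)
for row in clean:
    print(row)
```

In this sketch the matching function always picks the larger of two values under a fixed total order, so each update moves a value upward and the loop terminates. With other matching functions or several interacting MDs, the order of applications can matter, which is the situation in which multiple clean instances arise.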


Similar resources

ERBlox: Combining Matching Dependencies with Machine Learning for Entity Resolution

Entity resolution (ER), an important and common data cleaning problem, is about detecting duplicate data representations of the same external entities and merging them into single representations. Relatively recently, declarative rules called matching dependencies (MDs) have been proposed for specifying similarity conditions under which attribute values in database records are merged. In this...


MASTER OF SCIENCE Computational Mathematics and Modern Information Technologies

Entity-Relationship Data Model: Data structuring, Entity-Relationship Diagrams, Equivalence of Entity-Relationship and Functional Modeling, Algorithms for translating Entity-Relationship Diagrams into Relational and Elementary Mathematical Data Models. Relational Data Model: The structure of the Relational Data Model, Relational Algebra, Relational Calculus, Relational Query Languages, Stati...


Query Rewriting Using Datalog for Duplicate Resolution

Matching Dependencies (MDs) are a recent proposal for declarative entity resolution. They are rules that specify, given the similarities satisfied by values in a database, what values should be considered duplicates, and have to be matched. On the basis of a chase-like procedure for MD enforcement, we can obtain clean (duplicate-free) instances; actually possibly several of them. The clean answ...


A Rule-Based Approach to Analyzing Database Schema Objects with Datalog

Database schema elements such as tables, views, triggers and functions are typically defined with many interrelationships. In order to support database users in understanding a given schema, a rule-based approach for analyzing the respective dependencies is proposed using Datalog expressions. We show that many interesting properties of schema elements can be systematically determined this way. ...


String-Oriented Databases

Relational databases and Datalog view each attribute as indivisible. This view, though useful in several applications, does not provide a suitable database paradigm for use in genetic, multi-media or scientific databases. Data in these applications are unstructured; querying on sub-strings of attribute values is often necessary. Moreover, due to imprecision and incompleteness in the data, approx...




Publication date: 2017